Exploratory Data Analysis using the {gtsummary} package

Motivation

  • The replication crisis (also called the replicability crisis or the reproducibility crisis) is an ongoing methodological crisis in which the results of many scientific studies are difficult or impossible to reproduce.

  • The reproducibility crisis is associated with:

    • Low-quality medical research
    • Low-quality analysis code that contains errors
    • Reproduction attempts that are frequently laborious and time-consuming
**Raw data to summary table**

**SPSS output to summary table**

**R output to summary table**

  • Thus, the {gtsummary} package was developed to help R users with little coding experience produce presentation-ready tables that are reproducible and customizable.
Image source: Happy R adapted from artwork by @allison_horst; the beach and cocktail images are from pngtree.com

Introduction

Overview of {gtsummary} package

  • The {gtsummary} package provides an elegant and flexible way to create publication-ready analytical and summary tables using the R programming language.
  • Developed by Daniel D. Sjoberg et al.
  • Uses the {gt} package as its backend to produce highly reproducible, presentation-ready tables.
  • Latest version: 1.7.2 (2023-07-15)
  • Requirement: R ≥ 3.4

Import Packages

broom (>= 0.8.0) broom.helpers (>= 1.9.0) cli (>= 3.1.1)
dplyr (>= 1.0.7) forcats (>= 0.5.1) glue (>= 1.6.0)
gt (>= 0.7.0) knitr (>= 1.37) lifecycle (>= 1.0.1)
purrr (>= 0.3.4) rlang (>= 1.0.3) stringr (>= 1.4.0)
tibble (>= 3.1.6) tidyr (>= 1.1.4)

Function

  • Creates default tabular summaries with highly customizable capabilities
  • Summarize data frames (survival data, survey data)
  • Cross-tabulation
  • Summarize regression models (linear, logistic and survival)
  • Report statistics from gtsummary tables in-line in R Markdown
  • Stack and/or merge any table type
  • Standardize themes across tables
  • Choose different print engines

Analysis Workflow

  • Function (gtsummary function): tbl_summary, tbl_cross, tbl_uvregression, tbl_regression, tbl_merge, tbl_stack
  • Customization
    • Data arrangement (arguments): by, type, statistic, label
    • Additional information: add_*
    • Table cosmetics: modify_*, bold_*, italicize_*
  • Print engines: as_gt, as_flex_table, as_hux_table, as_kable_extra, as_kable, as_tibble
  • Theme: reset_gtsummary_theme, theme_gtsummary_journal(journal = "lancet"); the journal can be "lancet", "jama" and others
library(haven)
stroke <- read.csv("rconf.csv", stringsAsFactors = TRUE)

stroke$age <- as.numeric(stroke$age)

summary(stroke)
##       age            sex        ethnicity      married       dm        hpt     
##  Min.   :24.00   Female:123   Chinese: 34   Divorce:  8   No  :170   No  : 69  
##  1st Qu.:54.00   Male  :196   Indian : 25   Married:168   Yes :137   Yes :242  
##  Median :63.00                Malay  :259   Single :  6   NA's: 12   NA's:  8  
##  Mean   :63.12                Others :  1   NA's   :137                        
##  3rd Qu.:73.00                                                                 
##  Max.   :95.00                                                                 
##  NA's   :8                                                                     
##    ckd         af       hf.IHD     lipid      smoke    
##      :129       : 14       :129   No  :221   No  :285  
##  No  :189   No  :292   No  :175   Yes : 85   Yes : 29  
##  Yes :  1   Yes : 13   Yes : 15   NA's: 13   NA's:  5  
##                                                        
##                                                        
##                                                        
##                                                        
##                              WHO             dodiag         gcs     
##  ICH                           : 14   Missing   : 26   15     :243  
##  Intracerebral Hemorrhage (ICH): 21   15/03/2021:  6   10     : 15  
##  Ischaemic                     :281   01/03/2021:  4   13     : 12  
##  SAH                           :  3   03/01/2021:  4   11     : 11  
##                                       03/03/2021:  4   12     : 10  
##                                       04/01/2021:  4   9      :  9  
##                                       (Other)   :271   (Other): 19  
##                                nihss                               mrs    
##  Minor stroke (1-4)               :130   Moderately severe disability:85  
##  Moderate stroke (5-15)           : 86   No significant disability   :60  
##  Moderate to severe stroke (16-20): 20   Slight disability           :53  
##  No stroke symptoms (0)           : 45   Severe disability           :37  
##  Severe stroke (21-42)            : 17   Moderate disability         :35  
##  NA's                             : 21   (Other)                     :23  
##                                          NA's                        :26  
##  iv_thrombolysis iv_thrombectomy status_dis         dodis     status_f.u 
##  No :315         No:319          Alive:306             :129   Alive:228  
##  Yes:  4                         Death: 13   25/01/2018:  4   Died : 91  
##                                              02/03/2021:  3              
##                                              07/10/2020:  3              
##                                              08/01/2018:  3              
##                                              10/01/2018:  3              
##                                              (Other)   :174              
##        dodeath         Sebab.Kematian
##  22/11/2022:228               :228   
##  01/06/2021:  2   SAKIT TUA   : 36   
##  08/02/2021:  2   STROK       :  5   
##  08/04/2022:  2   SAKIT STROK :  4   
##  12/03/2021:  2   STROKE      :  4   
##  12/06/2022:  2   DARAH TINGGI:  2   
##  (Other)   : 81   (Other)     : 40
library(lubridate)
## 
## Attaching package: 'lubridate'
## The following objects are masked from 'package:base':
## 
##     date, intersect, setdiff, union
library(gtsummary)
library(dplyr)  # provides %>%, mutate() and select(), used below

stroke$dodiag <- as.Date(stroke$dodiag)
stroke$dodeath <- as.Date(stroke$dodeath)
stroke$dodis <- as.Date(stroke$dodis)
stroke <- stroke %>%
  mutate(
    dur       = as.duration(dodiag %--% dodeath),  # time from diagnosis to death or last follow-up
    dur_days  = dur / ddays(1),                    # duration in days
    dur_month = dur_days / 30.417                  # approximate months (365 / 12 days)
  )

str(stroke)
## 'data.frame':    319 obs. of  26 variables:
##  $ age            : num  52 73 64 58 64 64 66 52 76 45 ...
##  $ sex            : Factor w/ 2 levels "Female","Male ": 2 2 2 2 2 1 2 2 2 2 ...
##  $ ethnicity      : Factor w/ 4 levels "Chinese","Indian",..: 3 3 3 3 3 3 3 3 1 3 ...
##  $ married        : Factor w/ 3 levels "Divorce","Married",..: 2 2 2 2 2 2 1 2 2 2 ...
##  $ dm             : Factor w/ 2 levels "No","Yes ": 1 2 2 2 1 2 2 1 1 2 ...
##  $ hpt            : Factor w/ 2 levels "No","Yes ": 2 2 1 2 2 2 1 1 2 1 ...
##  $ ckd            : Factor w/ 3 levels "","No","Yes ": 2 2 2 2 2 2 2 2 2 2 ...
##  $ af             : Factor w/ 3 levels "","No","Yes ": 2 2 2 2 2 2 2 2 2 2 ...
##  $ hf.IHD         : Factor w/ 3 levels "","No","Yes ": 2 2 3 2 2 2 2 2 2 2 ...
##  $ lipid          : Factor w/ 2 levels "No","Yes": 1 1 1 2 2 1 1 1 1 1 ...
##  $ smoke          : Factor w/ 2 levels "No","Yes ": 1 1 1 1 1 1 1 1 1 1 ...
##  $ WHO            : Factor w/ 4 levels "ICH","Intracerebral Hemorrhage (ICH)",..: 3 3 3 3 3 3 3 3 3 3 ...
##  $ dodiag         : Date, format: "0012-09-20" "0022-09-20" ...
##  $ gcs            : Factor w/ 13 levels "10","11","12",..: 6 4 6 6 6 3 6 6 6 6 ...
##  $ nihss          : Factor w/ 5 levels "Minor stroke (1-4)",..: 2 2 2 2 1 1 1 2 1 1 ...
##  $ mrs            : Factor w/ 7 levels "Died (dischrage)",..: 3 7 6 3 4 7 7 3 7 4 ...
##  $ iv_thrombolysis: Factor w/ 2 levels "No","Yes": 1 1 1 1 1 1 1 1 1 1 ...
##  $ iv_thrombectomy: Factor w/ 1 level "No": 1 1 1 1 1 1 1 1 1 1 ...
##  $ status_dis     : Factor w/ 2 levels "Alive","Death": 1 1 1 1 1 1 1 1 1 1 ...
##  $ dodis          : Date, format: "0018-09-20" "0024-09-20" ...
##  $ status_f.u     : Factor w/ 2 levels "Alive","Died": 1 1 2 1 1 2 1 1 1 1 ...
##  $ dodeath        : Date, format: "0022-11-20" "0022-11-20" ...
##  $ Sebab.Kematian : Factor w/ 44 levels "","ACUTE HEMORRHAGIC STROKE",..: 1 1 26 1 1 40 1 1 1 1 ...
##  $ dur            :Formal class 'Duration' [package "lubridate"] with 1 slot
##   .. ..@ .Data: num  3.21e+08 5.27e+06 -4.29e+08 5.65e+08 5.34e+08 ...
##  $ dur_days       : num  3713 61 -4962 6544 6179 ...
##  $ dur_month      : num  122.07 2.01 -163.13 215.14 203.14 ...
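The str() output above shows dates with years such as 0012 and 0022, which suggests that the day/month/year strings in the CSV (for example 15/03/2021) were read year-first by as.Date(). A minimal sketch of parsing such strings with an explicit format follows. It uses made-up values rather than the stroke data, and the names raw_dates, default_parse and parsed are hypothetical.

```r
# Hypothetical example values in dd/mm/yyyy form, stored as a factor
# (as read.csv(..., stringsAsFactors = TRUE) would store them).
raw_dates <- factor(c("15/03/2021", "01/03/2021", "Missing"))

# Without a format, as.Date() tries "%Y-%m-%d" and then "%Y/%m/%d",
# so "15/03/2021" is read as year 15, month 3, day 20.
default_parse <- as.Date("15/03/2021")
as.POSIXlt(default_parse)$year + 1900  # calendar year of the default parse

# With an explicit format the day, month and year are read as intended;
# values that do not match the format (such as "Missing") become NA.
parsed <- as.Date(as.character(raw_dates), format = "%d/%m/%Y")
parsed
```

If the stroke dates were re-parsed this way, dur_days, dur_month and every table derived from them would need to be regenerated.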
stroke2 <- stroke %>% select(age, sex, dm, hpt, nihss, status_f.u, dur_days,dur_month)
stroke2_complete <- na.omit(stroke2)
stroke2_complete$nihss <- relevel(stroke2_complete$nihss, ref = "No stroke symptoms (0)")

Application and examples

Descriptive analysis

  1. tbl_summary()
library(gtsummary)
tbl_summary(stroke2_complete)
Characteristic N = 266¹
age 64 (55, 73)
sex
    Female 103 (39%)
    Male 163 (61%)
dm
    No 138 (52%)
    Yes 128 (48%)
hpt
    No 48 (18%)
    Yes 218 (82%)
nihss
    No stroke symptoms (0) 19 (7.1%)
    Minor stroke (1-4) 126 (47%)
    Moderate stroke (5-15) 85 (32%)
    Moderate to severe stroke (16-20) 19 (7.1%)
    Severe stroke (21-42) 17 (6.4%)
status_f.u
    Alive 186 (70%)
    Died 80 (30%)
dur_days 2,191 (-311, 5,075)
dur_month 72 (-10, 167)
¹ Median (IQR); n (%)
  2. tbl_cross()
tbl_cross(stroke2_complete,
          row = sex,
          col = status_f.u,
          percent = "row",
          margin = "row") %>% 
  add_p(source_note = TRUE)
status_f.u
Alive Died
sex
    Female 62 (60%) 41 (40%)
    Male 124 (76%) 39 (24%)
Total 186 (70%) 80 (30%)
Pearson’s Chi-squared test, p=0.006
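The p-value in the source note can be cross-checked in base R from the counts printed in the table. This sketch assumes the test is Pearson's chi-squared without a continuity correction; the object names counts and res are introduced here for illustration.

```r
# Counts copied from the cross-table above (rows: sex, columns: status_f.u)
counts <- matrix(c(62, 41,
                   124, 39),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(sex = c("Female", "Male"),
                                 status_f.u = c("Alive", "Died")))

# Pearson's chi-squared test without Yates' continuity correction
res <- chisq.test(counts, correct = FALSE)
round(res$p.value, 3)
```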

Binary logistic regression

Default output for regression analysis in R

default_stroke <- glm(status_f.u~age + sex, data = stroke2_complete, family = binomial(link = logit))

summary(default_stroke)
## 
## Call:
## glm(formula = status_f.u ~ age + sex, family = binomial(link = logit), 
##     data = stroke2_complete)
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -4.93939    0.89754  -5.503 3.73e-08 ***
## age          0.06785    0.01276   5.316 1.06e-07 ***
## sexMale     -0.64117    0.29368  -2.183    0.029 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 325.32  on 265  degrees of freedom
## Residual deviance: 283.23  on 263  degrees of freedom
## AIC: 289.23
## 
## Number of Fisher Scoring iterations: 4
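The coefficients above are on the log-odds scale, whereas the gtsummary tables in this document report odds ratios (exponentiate = TRUE). As a rough hand check, the estimates and standard errors printed above can be exponentiated directly. The Wald interval used here is an assumption made for simplicity and need not match the interval that gtsummary reports.

```r
# Estimates and standard errors copied from summary(default_stroke)
est <- c(age = 0.06785, sexMale = -0.64117)
se  <- c(age = 0.01276, sexMale = 0.29368)

# Odds ratios: exponentiate the log-odds coefficients
or <- exp(est)

# Approximate 95% Wald confidence limits on the odds-ratio scale
z  <- qnorm(0.975)
ci <- cbind(lower = exp(est - z * se), upper = exp(est + z * se))

round(cbind(OR = or, ci), 2)
```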

Logistic regression using gtsummary: univariable analysis

uvlog <- stroke2_complete %>% select(age, sex, dm, hpt, nihss, status_f.u) %>% 
  tbl_uvregression(method = glm,
                   y=status_f.u,
                   method.args = list(family = binomial),
                   exponentiate = TRUE)

uvlog
Characteristic N OR¹ 95% CI¹ p-value
age 266 1.07 1.05, 1.10 <0.001
sex 266
    Female
    Male 0.48 0.28, 0.81 0.006
dm 266
    No
    Yes 1.85 1.09, 3.16 0.024
hpt 266
    No
    Yes 1.56 0.77, 3.37 0.2
nihss 266
    No stroke symptoms (0)
    Minor stroke (1-4) 0.32 0.11, 1.00 0.040
    Moderate stroke (5-15) 1.44 0.52, 4.45 0.5
    Moderate to severe stroke (16-20) 2.98 0.81, 11.9 0.11
    Severe stroke (21-42) 7.04 1.72, 34.6 0.010
¹ OR = Odds Ratio, CI = Confidence Interval

Logistic regression using gtsummary: multivariable analysis

mvlog <- glm(status_f.u~age+sex+dm+hpt+nihss,
             stroke2_complete, 
             family = binomial)
mvlog %>% tbl_regression(
  exponentiate=TRUE
)
Characteristic OR¹ 95% CI¹ p-value
age 1.07 1.04, 1.10 <0.001
sex
    Female
    Male 0.61 0.32, 1.16 0.13
dm
    No
    Yes 1.73 0.89, 3.41 0.11
hpt
    No
    Yes 0.74 0.29, 1.91 0.5
nihss
    No stroke symptoms (0)
    Minor stroke (1-4) 0.44 0.13, 1.55 0.2
    Moderate stroke (5-15) 1.92 0.61, 6.62 0.3
    Moderate to severe stroke (16-20) 4.54 1.03, 22.0 0.051
    Severe stroke (21-42) 6.28 1.30, 35.8 0.028
¹ OR = Odds Ratio, CI = Confidence Interval

Survival analysis

Survival rate using tbl_survfit

library(survival)
library(gtsummary)
fit1 <- survfit(Surv(dur_month, status_f.u)~ 1, stroke2_complete)
fit2 <- survfit(Surv(dur_month, status_f.u)~ sex, stroke2_complete)

life_table <- list(fit1, fit2) %>% 
  tbl_survfit(times= c(1, 12, 36)) %>% 
  modify_header(update = list(
    stat_1 ~ "**1-month**",
    stat_2 ~ "**1-year**",
    stat_3 ~ "**3-years**"
  )) %>% 
  add_n() 
## tbl_survfit: Multi-state model detected. Showing probabilities into state 'Died'
## tbl_survfit: Multi-state model detected. Showing probabilities into state 'Died'
life_table
Characteristic N 1-month 1-year 3-years
Overall 266 0% (0%, 0%) 11% (8.0%, 16%) 14% (10%, 19%)
sex 266
    Female 0% (0%, 0%) 16% (10%, 25%) 18% (12%, 28%)
    Male 0% (0%, 0%) 8.4% (5.0%, 14%) 11% (6.8%, 17%)
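The "Multi-state model detected" message appears because status_f.u is a factor: Surv() treats a factor status as a multi-state endpoint, so the table reports the probability of being in state 'Died' rather than the survival probability. The sketch below illustrates the difference using the lung data that ships with {survival} instead of the stroke data; the column status_f and the object names are created here for illustration only.

```r
library(survival)

# lung ships with {survival}; status is coded 1 = censored, 2 = dead
lung2 <- lung
lung2$status_f <- factor(lung2$status, levels = c(1, 2),
                         labels = c("Alive", "Died"))

# Factor status: the first level is taken as censoring and the fit is multi-state
fit_factor <- survfit(Surv(time, status_f) ~ 1, data = lung2)

# Logical status: an ordinary Kaplan-Meier survival curve
fit_logical <- survfit(Surv(time, status_f == "Died") ~ 1, data = lung2)

class(fit_factor)
class(fit_logical)
```

Coding the event as a logical (or 0/1) value is the same pattern the Cox models in this document use with status_f.u == 'Died'.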

Semi-parametric survival using gtsummary: univariable analysis

library(survival)
cox_uv <- tbl_uvregression(
  stroke2_complete[c("dur_month", "status_f.u", "age", "sex", "dm",
             "hpt", "nihss")],
  method = coxph,
  y = Surv(time = dur_month, event = status_f.u=='Died'),
  exponentiate = TRUE
)

cox_uv
Characteristic N HR¹ 95% CI¹ p-value
age 266 1.05 1.03, 1.06 <0.001
sex 266
    Female
    Male 0.48 0.30, 0.74 0.001
dm 266
    No
    Yes 1.53 0.97, 2.39 0.066
hpt 266
    No
    Yes 1.32 0.70, 2.50 0.4
nihss 266
    No stroke symptoms (0)
    Minor stroke (1-4) 0.38 0.15, 0.99 0.047
    Moderate stroke (5-15) 1.32 0.55, 3.16 0.5
    Moderate to severe stroke (16-20) 1.66 0.61, 4.50 0.3
    Severe stroke (21-42) 4.12 1.56, 10.9 0.004
¹ HR = Hazard Ratio, CI = Confidence Interval

Semi-parametric survival using gtsummary: multivariable analysis

cox_mv <- coxph(Surv(time = dur_month, event = status_f.u=='Died')~
                  age+sex+dm+hpt+nihss,
             stroke2_complete) %>% 
  tbl_regression(exponentiate=TRUE)
  
cox_mv
Characteristic HR¹ 95% CI¹ p-value
age 1.04 1.02, 1.06 <0.001
sex
    Female
    Male 0.60 0.38, 0.96 0.033
dm
    No
    Yes 1.88 1.14, 3.10 0.013
hpt
    No
    Yes 0.60 0.30, 1.18 0.14
nihss
    No stroke symptoms (0)
    Minor stroke (1-4) 0.68 0.25, 1.88 0.5
    Moderate stroke (5-15) 2.20 0.88, 5.53 0.092
    Moderate to severe stroke (16-20) 2.64 0.93, 7.48 0.067
    Severe stroke (21-42) 7.40 2.56, 21.4 <0.001
¹ HR = Hazard Ratio, CI = Confidence Interval

Customization

**Customization**

**{gtsummary} + formulas**

Data arrangement

stroke3 <-  stroke2_complete %>% select(age, sex, dm, hpt, nihss)
desc <- tbl_summary(stroke3,
            by    = sex,
            label = list( age                 ~    "Age",
                          sex                 ~    "Gender",
                          dm                  ~    "Diabetes Mellitus",
                          hpt                 ~    "Hypertension",
                          nihss               ~    "NIHSS Score"),
            digits =      c(all_continuous()  ~    1,
                            all_categorical() ~    0),
            statistic =   c(all_categorical() ~    "{n} ({p}%)",
                            all_continuous()  ~    "{mean} ({sd})"))

desc 
Characteristic Female, N = 103¹ Male, N = 163¹
Age 65.1 (14.9) 62.9 (11.8)
Diabetes Mellitus
    No 51 (50%) 87 (53%)
    Yes 52 (50%) 76 (47%)
Hypertension
    No 13 (13%) 35 (21%)
    Yes 90 (87%) 128 (79%)
NIHSS Score
    No stroke symptoms (0) 6 (6%) 13 (8%)
    Minor stroke (1-4) 45 (44%) 81 (50%)
    Moderate stroke (5-15) 32 (31%) 53 (33%)
    Moderate to severe stroke (16-20) 9 (9%) 10 (6%)
    Severe stroke (21-42) 11 (11%) 6 (4%)
¹ Mean (SD); n (%)

Add extra information

# For the descriptive table
desc %>% 
  add_n() %>% 
  add_p() %>% 
  add_q()
## add_q: Adjusting p-values with
## `stats::p.adjust(x$table_body$p.value, method = "fdr")`
Characteristic N Female, N = 103¹ Male, N = 163¹ p-value² q-value³
Age 266 65.1 (14.9) 62.9 (11.8) 0.12 0.2
Diabetes Mellitus 266 0.5 0.5
    No 51 (50%) 87 (53%)
    Yes 52 (50%) 76 (47%)
Hypertension 266 0.067 0.2
    No 13 (13%) 35 (21%)
    Yes 90 (87%) 128 (79%)
NIHSS Score 266 0.2 0.2
    No stroke symptoms (0) 6 (6%) 13 (8%)
    Minor stroke (1-4) 45 (44%) 81 (50%)
    Moderate stroke (5-15) 32 (31%) 53 (33%)
    Moderate to severe stroke (16-20) 9 (9%) 10 (6%)
    Severe stroke (21-42) 11 (11%) 6 (4%)
¹ Mean (SD); n (%)
² Wilcoxon rank sum test; Pearson’s Chi-squared test
³ False discovery rate correction for multiple testing
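The q-values come from stats::p.adjust() with method = "fdr", as the message above states. The sketch below applies the same call to the rounded p-values printed in the table. Because those inputs are rounded, the result can differ slightly from the q-values that gtsummary computed from the unrounded values.

```r
# Rounded p-values copied from the table above
p <- c(Age = 0.12, `Diabetes Mellitus` = 0.5,
       Hypertension = 0.067, `NIHSS Score` = 0.2)

# Benjamini-Hochberg false discovery rate adjustment
q <- p.adjust(p, method = "fdr")
round(q, 2)
```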

Aesthetics

desc %>% 
  add_n() %>% 
  add_p() %>% 
  add_q() %>% 
  bold_labels() %>% 
  italicize_levels()
## add_q: Adjusting p-values with
## `stats::p.adjust(x$table_body$p.value, method = "fdr")`
Characteristic N Female, N = 103¹ Male, N = 163¹ p-value² q-value³
Age 266 65.1 (14.9) 62.9 (11.8) 0.12 0.2
Diabetes Mellitus 266 0.5 0.5
    No 51 (50%) 87 (53%)
    Yes 52 (50%) 76 (47%)
Hypertension 266 0.067 0.2
    No 13 (13%) 35 (21%)
    Yes 90 (87%) 128 (79%)
NIHSS Score 266 0.2 0.2
    No stroke symptoms (0) 6 (6%) 13 (8%)
    Minor stroke (1-4) 45 (44%) 81 (50%)
    Moderate stroke (5-15) 32 (31%) 53 (33%)
    Moderate to severe stroke (16-20) 9 (9%) 10 (6%)
    Severe stroke (21-42) 11 (11%) 6 (4%)
¹ Mean (SD); n (%)
² Wilcoxon rank sum test; Pearson’s Chi-squared test
³ False discovery rate correction for multiple testing
# bold_p() can be added to bold p-values below a threshold (0.05 by default)
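As a sketch of the bold_p() step mentioned in the comment, the example below uses the trial data set bundled with {gtsummary} rather than the stroke data, so that it can be run on its own. The object name tbl_bold is hypothetical.

```r
library(gtsummary)

# trial is an example data set shipped with {gtsummary}
tbl_bold <- trial %>%
  tbl_summary(by = trt, include = c(age, grade, response)) %>%
  add_p() %>%
  bold_p(t = 0.05)  # bold p-values below 0.05

tbl_bold
```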

Merging and stacking

**Stacking and merging**

Merging

cox_uv <- tbl_uvregression(
  stroke2_complete[c("dur_month", "status_f.u", "age", "sex", "dm",
                     "hpt", "nihss")],
  method = coxph,
  y = Surv(time = dur_month, event = status_f.u == "Died"),
  exponentiate = TRUE,
  label = list(age   ~ "Age",
               sex   ~ "Gender",
               dm    ~ "Diabetes Mellitus",
               hpt   ~ "Hypertension",
               nihss ~ "NIHSS Score")
)
cox_mv <- coxph(Surv(time = dur_month, event = status_f.u == "Died") ~
                  age + sex + dm + hpt + nihss,
                stroke2_complete) %>% 
  tbl_regression(exponentiate = TRUE,
                 label = list(age   ~ "Age",
                              sex   ~ "Gender",
                              dm    ~ "Diabetes Mellitus",
                              hpt   ~ "Hypertension",
                              nihss ~ "NIHSS Score"))
tbl_surv_merge <- tbl_merge(
  list(cox_uv, cox_mv),
  tab_spanner = c("**Univariable**","**Multivariable**")
)
tbl_surv_merge
Characteristic Univariable Multivariable
N HR¹ 95% CI¹ p-value HR¹ 95% CI¹ p-value
Age 266 1.05 1.03, 1.06 <0.001 1.04 1.02, 1.06 <0.001
Gender 266
    Female
    Male 0.48 0.30, 0.74 0.001 0.60 0.38, 0.96 0.033
Diabetes Mellitus 266
    No
    Yes 1.53 0.97, 2.39 0.066 1.88 1.14, 3.10 0.013
Hypertension 266
    No
    Yes 1.32 0.70, 2.50 0.4 0.60 0.30, 1.18 0.14
NIHSS Score 266
    No stroke symptoms (0)
    Minor stroke (1-4) 0.38 0.15, 0.99 0.047 0.68 0.25, 1.88 0.5
    Moderate stroke (5-15) 1.32 0.55, 3.16 0.5 2.20 0.88, 5.53 0.092
    Moderate to severe stroke (16-20) 1.66 0.61, 4.50 0.3 2.64 0.93, 7.48 0.067
    Severe stroke (21-42) 4.12 1.56, 10.9 0.004 7.40 2.56, 21.4 <0.001
¹ HR = Hazard Ratio, CI = Confidence Interval

Stacking

t1 <- glm(status_f.u~sex,
          data = stroke2_complete,
          family = binomial) %>% 
  tbl_regression(
    exponentiate=TRUE,
    label=list(sex ~"Gender (unadjusted)")
  )

t2 <- glm(status_f.u~sex+age+dm+hpt,
          data = stroke2_complete,
          family = binomial) %>% 
  tbl_regression(
    include="sex",
    exponentiate=TRUE,
    label=list(sex ~"Gender (adjusted)")
  )

table_stack_ex1 <- tbl_stack(list(t1, t2))
table_stack_ex1
Characteristic OR¹ 95% CI¹ p-value
Gender (unadjusted)
    Female
    Male 0.48 0.28, 0.81 0.006
Gender (adjusted)
    Female
    Male 0.53 0.30, 0.95 0.032
¹ OR = Odds Ratio, CI = Confidence Interval

Journal table format

reset_gtsummary_theme()
theme_gtsummary_journal(journal = "lancet")
## Setting theme `The Lancet`
lancet_theme <- tbl_surv_merge %>% 
  bold_labels() %>% 
  italicize_levels() %>%
  as_gt() %>% 
  gt::tab_header("Journal Theme (Lancet)")

lancet_theme
Journal Theme (Lancet)
Characteristic Univariable Multivariable
N HR¹ 95% CI¹ p-value HR¹ 95% CI¹ p-value
Age 266 1·05 1.03, 1.06 <0·001 1·04 1.02, 1.06 <0·001
Gender 266
    Female
    Male 0·48 0.30, 0.74 0·001 0·60 0.38, 0.96 0·033
Diabetes Mellitus 266
    No
    Yes 1·53 0.97, 2.39 0·066 1·88 1.14, 3.10 0·013
Hypertension 266
    No
    Yes 1·32 0.70, 2.50 0·4 0·60 0.30, 1.18 0·14
NIHSS Score 266
    No stroke symptoms (0)
    Minor stroke (1-4) 0·38 0.15, 0.99 0·047 0·68 0.25, 1.88 0·5
    Moderate stroke (5-15) 1·32 0.55, 3.16 0·5 2·20 0.88, 5.53 0·092
    Moderate to severe stroke (16-20) 1·66 0.61, 4.50 0·3 2·64 0.93, 7.48 0·067
    Severe stroke (21-42) 4·12 1.56, 10.9 0·004 7·40 2.56, 21.4 <0·001
¹ HR = Hazard Ratio, CI = Confidence Interval

Report statistics in line

Tables are important, but sometimes a result still needs to be reported in-line in the text of a report. This is especially true when explaining the results of regression models to readers.

With the inline_text() function, any statistic reported in a {gtsummary} table can be extracted and reported in-line in R Markdown.

For example:

We want to report the hazard ratio for age from the table cox_mv:

cox_mv
Characteristic HR¹ 95% CI¹ p-value
Age 1·04 1.02, 1.06 <0·001
Gender
    Female
    Male 0·60 0.38, 0.96 0·033
Diabetes Mellitus
    No
    Yes 1·88 1.14, 3.10 0·013
Hypertension
    No
    Yes 0·60 0.30, 1.18 0·14
NIHSS Score
    No stroke symptoms (0)
    Minor stroke (1-4) 0·68 0.25, 1.88 0·5
    Moderate stroke (5-15) 2·20 0.88, 5.53 0·092
    Moderate to severe stroke (16-20) 2·64 0.93, 7.48 0·067
    Severe stroke (21-42) 7·40 2.56, 21.4 <0·001
¹ HR = Hazard Ratio, CI = Confidence Interval

For every 1-year increase in age, the hazard of death increases by 4% (HR 1·04; 95% CI 1·02, 1·06; p<0·001).
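A sketch of how such a sentence could be generated with inline_text() follows. It uses the trial data set bundled with {gtsummary} so that it runs on its own; the object names m_trial, tbl_trial and txt are hypothetical, and the same call pattern would apply to cox_mv.

```r
library(gtsummary)

# Logistic model on the example data shipped with {gtsummary}
m_trial   <- glm(response ~ age + grade, data = trial, family = binomial)
tbl_trial <- tbl_regression(m_trial, exponentiate = TRUE)

# Default pattern: estimate, 95% CI and p-value for one variable
txt <- inline_text(tbl_trial, variable = age)
txt

# Custom pattern built from glue-style placeholders
inline_text(tbl_trial, variable = age,
            pattern = "OR {estimate} (95% CI {conf.low} to {conf.high})")
```

In an R Markdown document the call is typically placed in an inline R expression within the sentence, so the reported numbers update whenever the model is refitted.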

Conclusion

  1. Tables serve as a tool for communicating discrete data or direct comparisons within a report.
  2. The {gtsummary} package assists researchers in producing reproducible, presentation-ready tables.